<?xml version="1.0" encoding="utf-8"?>

<overlay xmlns="http://hoa-project.net/xyl/xylophone">
<yield id="chapter">

  <p>Strings can sometimes be <strong>complex</strong>, especially when they use
  the <code>Unicode</code> encoding format. The <code>Hoa\Ustring</code> library
  provides several operations on UTF-8 strings.</p>

  <h2 id="Table_of_contents">Table of contents</h2>

  <tableofcontents id="main-toc" />

  <h2 id="Introduction" for="main-toc">Introduction</h2>

  <p>When we manipulate strings, the <a href="http://unicode.org/">Unicode</a>
  format has established itself because of its <strong>compatibility</strong>
  with historical formats (like ASCII) and its capacity to represent a
  <strong>large</strong> range of characters and symbols for all cultures and
  all regions of the world. PHP provides several tools to manipulate such
  strings, like the following extensions:
  <a href="http://php.net/mbstring"><code>mbstring</code></a>,
  <a href="http://php.net/iconv"><code>iconv</code></a> and the excellent
  <a href="http://php.net/intl"><code>intl</code></a>, which is based on
  <a href="http://icu-project.org/">ICU</a>, the reference implementation of
  Unicode. Unfortunately, we sometimes have to mix these extensions to achieve
  our aims, at the cost of a certain <strong>complexity</strong> along with
  a regrettable <strong>verbosity</strong>.</p>
  <p>The <code>Hoa\Ustring</code> library addresses these issues by providing a
  <strong>simple</strong> way to manipulate strings with
  <strong>performance</strong> and <strong>efficiency</strong> in mind. It
  also provides some advanced algorithms to perform <strong>search</strong>
  operations on strings.</p>

  <h2 id="Unicode_strings" for="main-toc">Unicode strings</h2>

  <p>The <code>Hoa\Ustring\Ustring</code> class represents a
  <strong>UTF-8</strong> Unicode string and allows us to manipulate it easily.
  This class implements the
  <a href="http://php.net/arrayaccess"><code>ArrayAccess</code></a>,
  <a href="http://php.net/countable"><code>Countable</code></a> and
  <a href="http://php.net/iteratoraggregate"><code>IteratorAggregate</code></a>
  interfaces. We are going to use three examples in three different languages:
  French, Arabic and Japanese. Thus:</p>
  <pre><code class="language-php">$french   = new Hoa\Ustring\Ustring('Je t\'aime');
$arabic   = new Hoa\Ustring\Ustring('أحبك');
$japanese = new Hoa\Ustring\Ustring('私はあなたを愛して');</code></pre>
  <p>Now, let's see what we can do on these three strings.</p>

  <h3 id="String_manipulation" for="main-toc">String manipulation</h3>

  <p>Let's start with <strong>elementary</strong> operations. If we would like
  to <strong>count</strong> the number of characters (not bytes), we will use
  the <a href="http://php.net/count"><code>count</code> function</a>. Thus:</p>
  <pre><code class="language-php">var_dump(
    count($french),
    count($arabic),
    count($japanese)
);

/**
 * Will output:
 *     int(9)
 *     int(4)
 *     int(9)
 */</code></pre>
  <p>When we speak about text position, it is not suitable to speak about the
  right or the left, but rather about a <strong>beginning</strong> or an
  <strong>end</strong>, and based on the <strong>direction</strong> of writing.
  We can know this direction thanks to the
  <code>Hoa\Ustring\Ustring::getDirection</code> method. It returns the value of
  one of the following constants:</p>
  <ul>
    <li><code>Hoa\Ustring\Ustring::LTR</code>, for left-to-right, if the text is
    written from the left to the right,</li>
    <li><code>Hoa\Ustring\Ustring::RTL</code>, for right-to-left, if the text is
    written from the right to the left.</li>
  </ul>
  <p>Let's observe the result with our examples:</p>
  <pre><code class="language-php">var_dump(
    $french->getDirection()   === Hoa\Ustring\Ustring::LTR, // is left-to-right?
    $arabic->getDirection()   === Hoa\Ustring\Ustring::RTL, // is right-to-left?
    $japanese->getDirection() === Hoa\Ustring\Ustring::LTR  // is left-to-right?
);

/**
 * Will output:
 *     bool(true)
 *     bool(true)
 *     bool(true)
 */</code></pre>
  <p>The result of this method is computed thanks to the
  <code>Hoa\Ustring\Ustring::getCharDirection</code> static method which computes
  the direction of only one character.</p>
  <p>If we would like to <strong>concatenate</strong> another string to the end
  or to the beginning, we will respectively use the
  <code>Hoa\Ustring\Ustring::append</code> and
  <code>Hoa\Ustring\Ustring::prepend</code> methods. These methods, like most of
  the ones which modify the string, return the object itself, in order to
  chain the calls. For instance:</p>
  <pre><code class="language-php">echo $french->append('… et toi, m\'aimes-tu ?')->prepend('Mam\'zelle ! ');

/**
 * Will output:
 *     Mam'zelle ! Je t'aime… et toi, m'aimes-tu ?
 */</code></pre>
  <p>We also have the <code>Hoa\Ustring\Ustring::toLowerCase</code> and
  <code>Hoa\Ustring\Ustring::toUpperCase</code> methods to, respectively, set
  the case of the string to lower or upper. For instance:</p>
  <pre><code class="language-php">echo $french->toUpperCase();

/**
 * Will output:
 *     MAM'ZELLE ! JE T'AIME… ET TOI, M'AIMES-TU ?
 */</code></pre>
  <p>We can also add characters to the beginning or to the end of the string to
  reach a <strong>minimum</strong> length. This operation is frequently called
  <em>padding</em> (for historical reasons dating back to typewriters).
  That's why we have the <code>Hoa\Ustring\Ustring::pad</code> method which
  takes three arguments: the minimum length, characters to add and a constant
  indicating whether we have to add at the end or at the beginning of the string
  (respectively <code>Hoa\Ustring\Ustring::END</code>, by default, and
  <code>Hoa\Ustring\Ustring::BEGINNING</code>).</p>
  <pre><code class="language-php">echo $arabic->pad(20, ' ');

/**
 * Will output:
 *                     أحبك
 */</code></pre>
  <p>A similar operation removes, by default, <strong>spaces</strong>
  at the beginning and at the end of the string thanks to the
  <code>Hoa\Ustring\Ustring::trim</code> method. For example, to retrieve our
  original Arabic string:</p>
  <pre><code class="language-php">echo $arabic->trim();

/**
 * Will output:
 *     أحبك
 */</code></pre>
  <p>If we would like to remove other characters, we can use its first argument,
  which must be a regular expression. Finally, its second argument specifies
  the side where characters are removed: at the beginning, at the end or
  both, still by using the <code>Hoa\Ustring\Ustring::BEGINNING</code> and
  <code>Hoa\Ustring\Ustring::END</code> constants. We can combine these
  constants to express “both sides”, which is the default value:
  <code class="language-php">Hoa\Ustring\Ustring::BEGINNING |
  Hoa\Ustring\Ustring::END</code>. For example, to remove all the digits and
  the spaces only at the end, we will write:</p>
  <pre><code class="language-php">$arabic->trim('\s|\d', Hoa\Ustring\Ustring::END);</code></pre>
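  <p>To picture how the side argument can act as a bit mask, here is a minimal
  sketch written with plain PCRE functions. The <code>trimSketch</code>
  function and the two <code>SKETCH_*</code> constants are hypothetical names
  introduced for this example; this is an illustration under these assumptions,
  not the implementation of the library:</p>
  <pre><code class="language-php">// Hypothetical stand-ins for the BEGINNING and END constants.
const SKETCH_BEGINNING = 1;
const SKETCH_END       = 2;

// Hypothetical helper: removes runs matching $regex on the requested side(s).
function trimSketch($string, $regex = '\s', $side = 3)
{
    $run      = '(?:' . $regex . ')+';
    $patterns = array();

    if (0 !== ($side &amp; SKETCH_BEGINNING)) {
        $patterns[] = '^' . $run;
    }

    if (0 !== ($side &amp; SKETCH_END)) {
        $patterns[] = $run . '$';
    }

    if (empty($patterns)) {
        return $string;
    }

    // The “u” modifier asks PCRE to treat the pattern and the subject as UTF-8.
    return preg_replace('#' . implode('|', $patterns) . '#u', '', $string);
}

var_dump(
    trimSketch('  abc 123 ', '\s|\d', SKETCH_END),
    trimSketch('  abc 123 ', '\s', SKETCH_BEGINNING | SKETCH_END)
);

/**
 * Will output:
 *     string(5) "  abc"
 *     string(7) "abc 123"
 */</code></pre>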
  <p>We can also <strong>reduce</strong> the string to a
  <strong>sub-string</strong> by specifying the position of the first character
  followed by the length of the sub-string to the
  <code>Hoa\Ustring\Ustring::reduce</code> method:</p>
  <pre><code class="language-php">echo $french->reduce(3, 6)->reduce(2, 4);

/**
 * Will output:
 *     aime
 */</code></pre>
  <p>If we would like to get a specific character, we can rely on the
  <code>ArrayAccess</code> interface. For instance, to get the first character
  of each of our examples (from their original definitions):</p>
  <pre><code class="language-php">var_dump(
    $french[0],
    $arabic[0],
    $japanese[0]
);

/**
 * Will output:
 *     string(1) "J"
 *     string(2) "أ"
 *     string(3) "私"
 */</code></pre>
  <p>If we would like the last character, we will use the -1 index. The index is
  not bounded by the length of the string: if the index exceeds this length,
  then a <em>modulo</em> will be applied.</p>
  <p>We can also modify or remove a specific character with this method. For
  example:</p>
  <pre><code class="language-php">$french->append(' ?');
$french[-1] = '!';
echo $french;

/**
 * Will output:
 *     Je t'aime !
 */</code></pre>
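  <p>The modulo rule applied to indexes outside the string can be pictured with
  the following sketch. The <code>normalizeIndexSketch</code> function is a
  hypothetical helper, written under the assumption that negative indexes count
  from the end; the library may compute this differently:</p>
  <pre><code class="language-php">// Hypothetical helper: maps any integer index into the range [0, $length - 1].
function normalizeIndexSketch($index, $length)
{
    if (0 === $length) {
        return 0;
    }

    // The % operator keeps the sign of its left operand, hence the second modulo.
    return (($index % $length) + $length) % $length;
}

$length = 9; // number of characters in “Je t'aime”

var_dump(
    normalizeIndexSketch(-1, $length),
    normalizeIndexSketch(9, $length),
    normalizeIndexSketch(12, $length)
);

/**
 * Will output:
 *     int(8)
 *     int(0)
 *     int(3)
 */</code></pre>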
  <p>Another very useful method is the <strong>ASCII</strong> transformation.
  Be careful: depending on your settings, this is not always possible. For
  example:</p>
  <pre><code class="language-php">$title = new Hoa\Ustring\Ustring('Un été brûlant sur la côte');
echo $title->toAscii();

/**
 * Will output:
 *     Un ete brulant sur la cote
 */</code></pre>
  <p>We can also transform from Arabic or Japanese to ASCII. Symbols, like
  mathematical symbols or emojis, are also transformed:</p>
  <pre><code class="language-php">$emoji = new Hoa\Ustring\Ustring('I ❤ Unicode');
$maths = new Hoa\Ustring\Ustring('∀ i ∈ ℕ');

echo
    $arabic->toAscii(), "\n",
    $japanese->toAscii(), "\n",
    $emoji->toAscii(), "\n",
    $maths->toAscii(), "\n";

/**
 * Will output:
 *     ahbk
 *     sihaanatawo aishite
 *     I (heavy black heart)️ Unicode
 *     (for all) i (element of) N
 */</code></pre>
  <p>For this method to work correctly, the
  <a href="http://php.net/intl"><code>intl</code></a> extension needs to be
  present, so that the
  <a href="http://php.net/transliterator"><code>Transliterator</code></a> class
  is available. If it does not exist, the
  <a href="http://php.net/normalizer"><code>Normalizer</code></a> class must
  exist. If this class does not exist either, the
  <code>Hoa\Ustring\Ustring::toAscii</code> method can still try a
  transformation, but it is less effective. To activate this last solution,
  <code>true</code> must be passed as a single argument. This <em lang="fr">tour
  de force</em> is not recommended in most cases.</p>
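  <p>The fallback order described above can be sketched with the
  <code>intl</code> classes used directly. This is only an approximation: the
  <code>toAsciiSketch</code> function and the
  <code>Any-Latin; Latin-ASCII</code> identifier are assumptions made for this
  example, not necessarily what the library uses:</p>
  <pre><code class="language-php">// Hypothetical helper illustrating the Transliterator, then Normalizer order.
function toAsciiSketch($string)
{
    // Preferred path: ICU transliteration.
    if (class_exists('Transliterator')) {
        $transliterator = Transliterator::create('Any-Latin; Latin-ASCII');

        if (null !== $transliterator) {
            return $transliterator->transliterate($string);
        }
    }

    // Second path: decompose the characters, then drop the combining marks.
    if (class_exists('Normalizer')) {
        $decomposed = Normalizer::normalize($string, Normalizer::FORM_KD);

        return preg_replace('#\p{Mn}+#u', '', $decomposed);
    }

    // No suitable class: this sketch leaves the string untouched.
    return $string;
}

echo toAsciiSketch('Un été brûlant sur la côte'), "\n";

/**
 * With the intl extension loaded, will output:
 *     Un ete brulant sur la cote
 */</code></pre>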
  <p>We also find the <code>getTransliterator</code> method which returns a
  <code>Transliterator</code> object, or <code>null</code> if this class does
  not exist. This method takes a transliteration identifier as argument. We
  suggest <a href="http://userguide.icu-project.org/transforms/general">reading
  the documentation about the transliterator of ICU</a> to understand this
  identifier. The <code>transliterate</code> method transliterates the
  current string based on an identifier, a beginning index and an end
  index. This method works the same way as the
  <a href="http://php.net/transliterator.transliterate"><code>Transliterator::transliterate</code></a>
  method.</p>
  <p>More generally, to change the <strong>encoding</strong> format, we can use
  the <code>Hoa\Ustring\Ustring::transcode</code> static method, with a string
  as first argument, the original encoding format as second argument and the
  expected encoding format as third argument (UTF-8 by default). To get the
  list of encoding formats, we have to refer to the
  <a href="http://php.net/iconv"><code>iconv</code></a> extension or to use the
  following command line in a terminal:</p>
  <pre><code class="language-shell">$ iconv --list</code></pre>
  <p>To know if a string is encoded in UTF-8, we can use the
  <code>Hoa\Ustring\Ustring::isUtf8</code> static method; for instance:</p>
  <pre><code class="language-php">var_dump(
    Hoa\Ustring\Ustring::isUtf8('a'),
    Hoa\Ustring\Ustring::isUtf8(Hoa\Ustring\Ustring::transcode('a', 'UTF-8', 'UTF-16'))
);

/**
 * Will output:
 *     bool(true)
 *     bool(false)
 */</code></pre>
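  <p>One common way to perform such a check with PCRE alone is to let the
  <code>u</code> modifier validate the subject. The sketch below makes no
  assumption about how <code>isUtf8</code> is implemented, and
  <code>looksLikeUtf8</code> is a hypothetical helper:</p>
  <pre><code class="language-php">// Hypothetical helper: with the “u” modifier, preg_match returns false
// (instead of 0 or 1) when the subject is not valid UTF-8.
function looksLikeUtf8($string)
{
    return 1 === preg_match('##u', $string);
}

var_dump(
    looksLikeUtf8('été'),
    looksLikeUtf8("\xe9t\xe9") // “été” encoded in ISO-8859-1
);

/**
 * Will output:
 *     bool(true)
 *     bool(false)
 */</code></pre>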
  <p>We can <strong>split</strong> the string into several sub-strings by using
  the <code>Hoa\Ustring\Ustring::split</code> method. As first argument, we have
  a <a href="http://pcre.org/">PCRE</a> regular expression, then an
  integer representing the maximum number of elements to return and finally a
  combination of constants. These constants are the same as the ones of
  <a href="http://php.net/preg_split"><code>preg_split</code></a>.</p>
  <p>By default, the second argument is set to -1, which means infinity, and the
  last argument is set to <code>PREG_SPLIT_NO_EMPTY</code>. Thus, if we would
  like to get all the words of a string, we will write:</p>
  <pre><code class="language-php">print_r($title->split('#\b|\s#'));

/**
 * Will output:
 *     Array
 *     (
 *         [0] => Un
 *         [1] => ete
 *         [2] => brulant
 *         [3] => sur
 *         [4] => la
 *         [5] => cote
 *     )
 */</code></pre>
  <p>If we would like to <strong>iterate</strong> over all the
  <strong>characters</strong>, it is recommended to rely on the
  <code>IteratorAggregate</code> interface, that is to say the
  <code>Hoa\Ustring\Ustring::getIterator</code> method. Let's see it on the
  Arabic example:</p>
  <pre><code class="language-php">foreach ($arabic as $letter) {
    echo $letter, "\n";
}

/**
 * Will output:
 *     أ
 *     ح
 *     ب
 *     ك
 */</code></pre>
  <p>We notice that the iteration is based on the text direction: the first
  element of the iteration is the first letter of the string, starting
  from the beginning.</p>
  <p>Of course, if we would like to get an array of characters, we can use the
  <a href="http://php.net/iterator_to_array"><code>iterator_to_array</code></a>
  PHP function:</p>
  <pre><code class="language-php">print_r(iterator_to_array($arabic));

/**
 * Will output:
 *     Array
 *     (
 *         [0] => أ
 *         [1] => ح
 *         [2] => ب
 *         [3] => ك
 *     )
 */</code></pre>

  <h3 id="Comparison_and_search" for="main-toc">Comparison and search</h3>

  <p>Strings can also be <strong>compared</strong> thanks to the
  <code>Hoa\Ustring\Ustring::compare</code> method:</p>
  <pre><code class="language-php">$string = new Hoa\Ustring\Ustring('abc');
var_dump(
    $string->compare('wxyz')
);

/**
 * Will output:
 *     int(-1)
 */</code></pre>
  <p>This method returns -1 if the initial string comes before (in
  alphabetical order), 0 if it is identical and 1 if it comes after. If we
  would like to use all the power of the underlying mechanism, we can call the
  <code>Hoa\Ustring\Ustring::getCollator</code> static method (if the
  <a href="http://php.net/Collator"><code>Collator</code></a> class exists;
  otherwise, <code>Hoa\Ustring\Ustring::compare</code> will use a simple
  byte-to-byte comparison without taking care of the other parameters). Thus,
  if we would like to sort an array of strings, we will write:</p>
  <pre><code class="language-php">$strings = array('c', 'Σ', 'd', 'x', 'α', 'a');
Hoa\Ustring\Ustring::getCollator()->sort($strings);
print_r($strings);

/**
 * Could output:
 *     Array
 *     (
 *         [0] => a
 *         [1] => c
 *         [2] => d
 *         [3] => x
 *         [4] => α
 *         [5] => Σ
 *     )
 */</code></pre>
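  <p>The fallback mentioned above (collation when the <code>Collator</code>
  class exists, a byte comparison otherwise) can be pictured like this. The
  <code>compareSketch</code> function is hypothetical, and the
  <code>root</code> locale given to <code>Collator</code> is an assumption
  chosen to keep the example independent of the system settings:</p>
  <pre><code class="language-php">// Hypothetical helper: returns -1, 0 or 1.
function compareSketch($left, $right)
{
    if (class_exists('Collator')) {
        $collator = new Collator('root');

        return $collator->compare($left, $right);
    }

    // Byte-to-byte comparison, clamped to -1, 0 or 1.
    return max(-1, min(1, strcmp($left, $right)));
}

var_dump(
    compareSketch('abc', 'wxyz'),
    compareSketch('abc', 'abc'),
    compareSketch('wxyz', 'abc')
);

/**
 * Will output:
 *     int(-1)
 *     int(0)
 *     int(1)
 */</code></pre>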
  <p>Comparison between two strings depends on the <strong>locale</strong>, that
  is to say the localization of the system: the language, the country, the
  region etc. We can use the
  <a href="@hack:chapter=Locale"><code>Hoa\Locale</code> library</a> to modify
  these data, but it is not a dependency of <code>Hoa\Ustring</code>.</p>
  <p>We can also know if a string <strong>matches</strong> a certain pattern,
  still expressed with a regular expression. To achieve that, we will use the
  <code>Hoa\Ustring\Ustring::match</code> method. This method relies on the
  <a href="http://php.net/preg_match"><code>preg_match</code></a> and
  <a href="http://php.net/preg_match_all"><code>preg_match_all</code></a> PHP
  functions, but by modifying the pattern's options to ensure the Unicode
  support. We have the following parameters: the pattern, a variable passed by
  reference to collect the matches, flags, an offset and finally a boolean
  indicating whether the search is global or not (respectively if we have to use
  <code>preg_match_all</code> or <code>preg_match</code>). By default, the
  search is not global.</p>
  <p>Thus, we will check that our French example contains <code>aime</code> with
  a direct object complement:</p>
  <pre><code class="language-php">$french->match('#(?:(?&lt;direct_object>\w)[\'\b])aime#', $matches);
var_dump($matches['direct_object']);

/**
 * Will output:
 *     string(1) "t"
 */</code></pre>
  <p>This method returns <code>false</code> if an error is raised (for example
  if the pattern is not correct), 0 if no match has been found, and the number
  of matches otherwise.</p>
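  <p>For comparison, here is roughly what such a call looks like with
  <code>preg_match</code> used directly, the <code>u</code> modifier being
  added by hand. This sketch only relies on the standard PCRE functions and
  makes no claim about the internals of the library:</p>
  <pre><code class="language-php">$subject = 'Je t\'aime';

// The “u” modifier enables UTF-8 support for the pattern and the subject.
$found = preg_match(
    '#(?:(?&lt;direct_object>\w)[\'\b])aime#u',
    $subject,
    $matches
);

var_dump($found, $matches['direct_object']);

// When nothing matches, the result is 0, not false.
var_dump(preg_match('#déteste#u', $subject));

/**
 * Will output:
 *     int(1)
 *     string(1) "t"
 *     int(0)
 */</code></pre>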
  <p>Similarly, we can <strong>search</strong> and <strong>replace</strong>
  sub-strings by other sub-strings based on a pattern, still expressed with a
  regular expression. To achieve that, we will use the
  <code>Hoa\Ustring\Ustring::replace</code> method. This method uses the
  <a href="http://php.net/preg_replace"><code>preg_replace</code></a> and
  <a href="http://php.net/preg_replace_callback"><code>preg_replace_callback</code></a>
  PHP functions, but still by modifying the pattern's options to ensure the
  Unicode support. As first argument, we find one or more patterns, as second
  argument, one or more replacements and as last argument the limit of
  replacements to apply. If the replacement is a callable, then the
  <code>preg_replace_callback</code> function will be used.</p>
  <p>Thus, we will modify our French example to be more polite:</p>
  <pre><code class="language-php">$french->replace('#(?:\w[\'\b])(?&lt;verb>aime)#', function ($matches) {
    return 'vous ' . $matches['verb'];
});

echo $french;

/**
 * Will output:
 *     Je vous aime
 */</code></pre>
  <p>The <code>Hoa\Ustring\Ustring</code> class provides constants which are
  aliases of existing PHP constants and ensure a better readability of the
  code:</p>
  <ul>
    <li><code>Hoa\Ustring\Ustring::WITHOUT_EMPTY</code>, alias of
    <code>PREG_SPLIT_NO_EMPTY</code>,</li>
    <li><code>Hoa\Ustring\Ustring::WITH_DELIMITERS</code>, alias of
    <code>PREG_SPLIT_DELIM_CAPTURE</code>,</li>
    <li><code>Hoa\Ustring\Ustring::WITH_OFFSET</code>, alias of
    <code>PREG_OFFSET_CAPTURE</code> and
    <code>PREG_SPLIT_OFFSET_CAPTURE</code>,</li>
    <li><code>Hoa\Ustring\Ustring::GROUP_BY_PATTERN</code>, alias of
    <code>PREG_PATTERN_ORDER</code>,</li>
    <li><code>Hoa\Ustring\Ustring::GROUP_BY_TUPLE</code>, alias of
    <code>PREG_SET_ORDER</code>.</li>
  </ul>
  <p>Because they are strict aliases, we can write:</p>
  <pre><code class="language-php">$string = new Hoa\Ustring\Ustring('abc1 defg2 hikl3 xyz4');
$string->match(
    '#(\w+)(\d)#',
    $matches,
    Hoa\Ustring\Ustring::WITH_OFFSET
  | Hoa\Ustring\Ustring::GROUP_BY_TUPLE,
    0,
    true
);</code></pre>
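  <p>Assuming the aliases listed above, this call is comparable to the
  following <code>preg_match_all</code> invocation written with the native
  constants (the <code>u</code> modifier is added manually here):</p>
  <pre><code class="language-php">preg_match_all(
    '#(\w+)(\d)#u',
    'abc1 defg2 hikl3 xyz4',
    $matches,
    PREG_OFFSET_CAPTURE | PREG_SET_ORDER
);

// With PREG_SET_ORDER, $matches[1] is the second match; with
// PREG_OFFSET_CAPTURE, each group is a pair (text, byte offset).
print_r($matches[1][1]);

/**
 * Will output:
 *     Array
 *     (
 *         [0] => defg
 *         [1] => 5
 *     )
 */</code></pre>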

  <h3 id="Characters" for="main-toc">Characters</h3>

  <p>The <code>Hoa\Ustring\Ustring</code> class offers static methods working on
  a single Unicode character. We have already mentioned the
  <code>getCharDirection</code> method which gives the
  <strong>direction</strong> of a character. We also have the
  <code>getCharWidth</code> method which counts the <strong>number of
  columns</strong> necessary to print a single character. Thus:</p>
  <pre><code class="language-php">var_dump(
    Hoa\Ustring\Ustring::getCharWidth(Hoa\Ustring\Ustring::fromCode(0x7f)),
    Hoa\Ustring\Ustring::getCharWidth('a'),
    Hoa\Ustring\Ustring::getCharWidth('㽠')
);

/**
 * Will output:
 *     int(-1)
 *     int(1)
 *     int(2)
 */</code></pre>
  <p>This method returns -1 or 0 if the character is not
  <strong>printable</strong> (for instance, if this is a control character, like
  <code>0x7f</code> which corresponds to <code>DELETE</code>), 1 or more if this
  is a character that can be printed. In our example, <code>㽠</code> requires
  2 columns to be printed.</p>
  <p>To get more semantics, we have the
  <code>Hoa\Ustring\Ustring::isCharPrintable</code> method which tells
  whether a character is printable or not.</p>
  <p>If we would like to count the number of columns necessary for a whole
  string, we have to use the <code>Hoa\Ustring\Ustring::getWidth</code> method.
  Thus:</p>
  <pre><code class="language-php">var_dump(
    $french->getWidth(),
    $arabic->getWidth(),
    $japanese->getWidth()
);

/**
 * Will output:
 *     int(9)
 *     int(4)
 *     int(18)
 */</code></pre>
  <p>Try this in your terminal with a <strong>monospaced</strong> font. You will
  observe that Japanese requires 18 columns to be printed. This measure is very
  useful if we would like to know the length of a string to position it
  efficiently.</p>
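  <p>To get a feeling for these numbers without the library, the
  <code>mb_strwidth</code> function of the <code>mbstring</code> extension
  gives comparable results for our three examples. It is used here as a
  stand-in only; it is not necessarily how <code>getWidth</code> is
  computed:</p>
  <pre><code class="language-php">var_dump(
    mb_strwidth('Je t\'aime', 'UTF-8'),
    mb_strwidth('أحبك', 'UTF-8'),
    mb_strwidth('私はあなたを愛して', 'UTF-8')
);

/**
 * Will output:
 *     int(9)
 *     int(4)
 *     int(18)
 */</code></pre>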
  <p>The <code>getCharWidth</code> method differs from <code>getWidth</code>
  because it includes control characters. This method is intended to be used,
  for example, with terminals (please, see the
  <a href="@hack:chapter=Console"><code>Hoa\Console</code> library</a>).</p>
  <p>Finally, if this time we are not interested in Unicode characters but
  rather by <strong>machine</strong> characters <code>char</code> (being
  1 byte), we have an extra operation. The
  <code>Hoa\Ustring\Ustring::getBytesLength</code> method will count the
  <strong>length</strong> of the string in bytes:</p>
  <pre><code class="language-php">var_dump(
    $arabic->getBytesLength(),
    $japanese->getBytesLength()
);

/**
 * Will output:
 *     int(8)
 *     int(27)
 */</code></pre>
  <p>If we compare these results with the ones of the
  <code>Hoa\Ustring\Ustring::count</code> method, we understand that the Arabic
  characters are encoded with 2 bytes whereas Japanese characters are encoded
  with 3 bytes. We can also get a specific byte thanks to the
  <code>Hoa\Ustring\Ustring::getByteAt</code> method. Once again, the index is
  not bounded.</p>
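  <p>The same relation between characters and bytes can be observed with the
  native functions of PHP, where <code>strlen</code> counts bytes and
  <code>mb_strlen</code> counts characters. This sketch bypasses the library
  entirely:</p>
  <pre><code class="language-php">$raw = 'أحبك';

var_dump(
    mb_strlen($raw, 'UTF-8'),        // characters
    strlen($raw),                    // bytes
    sprintf('0x%02x', ord($raw[0]))  // first byte of the first character
);

/**
 * Will output:
 *     int(4)
 *     int(8)
 *     string(4) "0xd8"
 */</code></pre>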

  <h3 id="Code-point" for="main-toc">Code-point</h3>

  <p>Each character is represented by an integer, called a
  <strong>code-point</strong>. To get the code-point of a character, we can
  use the <code>Hoa\Ustring\Ustring::toCode</code> static method, and to get a
  character based on its code-point, we can use the
  <code>Hoa\Ustring\Ustring::fromCode</code> static method. We also have the
  <code>Hoa\Ustring\Ustring::toBinaryCode</code> method which returns the binary
  representation of a character. Let's take an example:</p>
  <pre><code class="language-php">var_dump(
    Hoa\Ustring\Ustring::toCode('Σ'),
    Hoa\Ustring\Ustring::toBinaryCode('Σ'),
    Hoa\Ustring\Ustring::fromCode(0x3a3)
);

/**
 * Will output:
 *     int(931)
 *     string(16) "1100111010100011"
 *     string(2) "Σ"
 */</code></pre>
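  <p>The link between a character, its code-point and its UTF-8 bytes can also
  be checked with the <code>mbstring</code> extension (<code>mb_ord</code> and
  <code>mb_chr</code> require PHP 7.2 or later). The
  <code>binarySketch</code> function is a hypothetical helper written for this
  example:</p>
  <pre><code class="language-php">// Hypothetical helper: concatenates the 8-bit binary form of each byte.
function binarySketch($char)
{
    $out = '';

    foreach (str_split($char) as $byte) {
        $out .= sprintf('%08b', ord($byte));
    }

    return $out;
}

var_dump(
    mb_ord('Σ', 'UTF-8'),
    dechex(mb_ord('Σ', 'UTF-8')),
    binarySketch('Σ'),
    mb_chr(0x3a3, 'UTF-8')
);

/**
 * Will output:
 *     int(931)
 *     string(3) "3a3"
 *     string(16) "1100111010100011"
 *     string(2) "Σ"
 */</code></pre>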

  <h2 id="Search_algorithms" for="main-toc">Search algorithms</h2>

  <p>The <code>Hoa\Ustring</code> library provides sophisticated
  <strong>search</strong> algorithms on strings through the
  <code>Hoa\Ustring\Search</code> class.</p>
  <p>We will study the <code>Hoa\Ustring\Search::approximated</code> algorithm
  which searches a sub-string in a string up to <strong><em>k</em>
  differences</strong> (a difference is an addition, a deletion or a
  modification). Let's take the classical example of a DNA representation: We
  will search all the sub-strings approximating <code>GATAA</code> with
  1 difference (maximum) in <code>CAGATAAGAGAA</code>. So, we will write:</p>
  <pre><code class="language-php">$x      = 'GATAA';
$y      = 'CAGATAAGAGAA';
$k      = 1;
$search = Hoa\Ustring\Search::approximated($y, $x, $k);
$n      = count($search);

echo 'Try to match ', $x, ' in ', $y, ' with at most ', $k, ' difference(s):', "\n";
echo $n, ' match(es) found:', "\n";

foreach ($search as $position) {
    echo '    • ', substr($y, $position['i'], $position['l']), "\n";
}

/**
 * Will output:
 *     Try to match GATAA in CAGATAAGAGAA with at most 1 difference(s):
 *     4 match(es) found:
 *         • AGATA
 *         • GATAA
 *         • ATAAG
 *         • GAGAA
 */</code></pre>
  <p>This method returns an array of arrays. Each sub-array represents a result
  and contains three indexes: <code>i</code> for the position of the first
  character (byte) of the result, <code>j</code> for the position of the last
  character and <code>l</code> for the length of the result (simply
  <code>j</code> - <code>i</code>). Thus, we can compute the results by using
  our initial string (here <code class="language-php">$y</code>) and its
  indexes.</p>
  <p>With our example, we have four results. The first is <code>AGATA</code>,
  being <code>GATA<em>A</em></code> with one moved character, and
  <code>AGATA</code> exists in <code>C<em>AGATA</em>AGAGAA</code>.  The second
  result is <code>GATAA</code>, our sub-string, which well and truly exists in
  <code>CA<em>GATAA</em>GAGAA</code>. The third result is <code>ATAAG</code>,
  being <code><em>G</em>ATAA</code> with one moved character, and
  <code>ATAAG</code> exists in <code>CAG<em>ATAAG</em>AGAA</code>. Finally, the
  last result is <code>GAGAA</code>, being <code>GA<em>T</em>AA</code> with one
  modified character, and <code>GAGAA</code> exists in
  <code>CAGATAA<em>GAGAA</em></code>.</p>
  <p>Another example, more concrete this time. We will consider the
  <code>--testIt --foobar --testThat --testAt</code> string (which represents
  possible options of a command line), and we will search <code>--testot</code>,
  an option that should have been given by the user. This option does not exist
  as it is. We will then use our search algorithm with at most 1 difference.
  Let's see:</p>
  <pre><code class="language-php">$x      = 'testot';
$y      = '--testIt --foobar --testThat --testAt';
$k      = 1;
$search = Hoa\Ustring\Search::approximated($y, $x, $k);
$n      = count($search);

// …

/**
 * Will output:
 *     Try to match testot in --testIt --foobar --testThat --testAt with at most 1 difference(s):
 *     2 match(es) found:
 *         • testIt
 *         • testAt
 */</code></pre>
  <p>The <code>testIt</code> and <code>testAt</code> results are true options,
  so we can suggest them to the user. This is a mechanism used by
  <code>Hoa\Console</code> to suggest corrections to the user in case of a
  mistyping.</p>
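  <p>A simpler, word-by-word variant of this idea can be sketched with the
  native <code>levenshtein</code> function of PHP. Note that this is a
  different technique from <code>Hoa\Ustring\Search::approximated</code>: it
  compares whole options instead of searching sub-strings, it works on bytes,
  and <code>suggestSketch</code> is a hypothetical helper:</p>
  <pre><code class="language-php">// Hypothetical helper: keeps the options within $k differences of $typed.
function suggestSketch($typed, array $options, $k)
{
    $suggestions = array();

    foreach ($options as $option) {
        // levenshtein() counts insertions, deletions and substitutions,
        // byte by byte and case-sensitively.
        if ($k >= levenshtein($typed, $option)) {
            $suggestions[] = $option;
        }
    }

    return $suggestions;
}

$options = array('testIt', 'foobar', 'testThat', 'testAt');

print_r(suggestSketch('testot', $options, 1));

/**
 * Will output:
 *     Array
 *     (
 *         [0] => testIt
 *         [1] => testAt
 *     )
 */</code></pre>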

  <h2 id="Conclusion" for="main-toc">Conclusion</h2>

  <p>The <code>Hoa\Ustring</code> library provides facilities to manipulate
  strings encoded with the Unicode format, but also to perform sophisticated
  searches on strings.</p>

</yield>
</overlay>
